Visual Place Recognition for Autonomous Robots
Autonomous robotics has been the subject of great interest within the research community over the past few decades. Its applications are widespread, ranging from healthcare to manufacturing, goods transportation to home deliveries, site maintenance to construction, and planetary exploration to rescue operations, among many others including agriculture, defence, commerce, leisure and extreme environments. At the core of robot autonomy lies the problem of localisation, i.e., knowing where the robot is; within the robotics community, this problem is termed place recognition. Place recognition using only visual input is termed Visual Place Recognition (VPR) and refers to the ability of an autonomous system to recall a previously visited place using only visual input, under changing viewpoint, illumination and seasonal conditions, and given computational and storage constraints.
This thesis is a collection of four inter-linked, mutually relevant but distinct topics within VPR: 1) What makes a place/image worthy for VPR? 2) How should the state-of-the-art in VPR be defined? 3) Do VPR techniques designed for ground-based platforms extend to aerial platforms? 4) Can a handcrafted VPR technique outperform deep-learning-based VPR techniques? Each of these questions is addressed in a dedicated, peer-reviewed chapter of this thesis, and the author attempts to answer them to the best of his abilities.
The worthiness of a place essentially refers to the salience and distinctiveness of the content in the image of that place. This salience is modelled as a framework, namely memorable-maps, comprising three conjoint criteria: a) human memorability of an image, b) staticity and c) information content. Because a large number of VPR techniques have been proposed over the past 10-15 years, and because the datasets and metrics employed for evaluation vary widely, the true state-of-the-art remains ambiguous. The author levels this playing field by deploying 10 contemporary techniques on a common platform and using the most challenging VPR datasets to provide a holistic performance comparison. This platform is then extended to aerial place recognition datasets to answer the third question above. Finally, the author designs a novel, handcrafted, compute-efficient and training-free VPR technique that outperforms state-of-the-art VPR techniques on five different VPR datasets.
Sensors, SLAM and Long-term Autonomy: A Review
Simultaneous Localization and Mapping, commonly known as SLAM, has been an active research area in the field of Robotics over the past three decades. For solving the SLAM problem, every robot is equipped with either a single sensor or a combination of similar/different sensors. This paper attempts to review, discuss, evaluate and compare these sensors. Keeping an eye on the future, it also assesses the characteristics of these sensors against factors critical to the long-term autonomy challenge.
CoHOG: A Light-Weight, Compute-Efficient, and Training-Free Visual Place Recognition Technique for Changing Environments
This letter presents a novel, compute-efficient and training-free approach based on the Histogram-of-Oriented-Gradients (HOG) descriptor for achieving state-of-the-art performance-per-compute-unit in Visual Place Recognition (VPR). The inspiration for this approach (namely CoHOG) comes from the convolutional scanning and region-based feature extraction employed by Convolutional Neural Networks (CNNs). By using image entropy to extract regions of interest (ROI) and performing regional-convolutional descriptor matching, our technique achieves successful place recognition in changing environments. We use viewpoint- and appearance-variant public VPR datasets to report this matching performance, at lower RAM commitment, zero training requirements and a feature encoding time 20 times lower than state-of-the-art neural networks. We also discuss the image retrieval time of CoHOG and the effect of CoHOG's parametric variation on its place matching performance and encoding time.
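The two stages the abstract describes, entropy-based region selection followed by HOG-style descriptor matching, can be sketched minimally as follows. The grid size, entropy threshold, histogram layout and same-position matching below are illustrative simplifications, not the published CoHOG parameters (CoHOG matches each query region convolutionally against all candidate regions):

```python
import numpy as np

def block_entropy(block, bins=32):
    """Shannon entropy (bits) of a grayscale block's intensity histogram."""
    hist, _ = np.histogram(block, bins=bins, range=(0.0, 1.0))
    p = hist / hist.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def hog_descriptor(block, n_bins=8):
    """Coarse, L2-normalised histogram of oriented gradients for one block."""
    gy, gx = np.gradient(block.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.arctan2(gy, gx) % np.pi  # unsigned orientation in [0, pi)
    hist, _ = np.histogram(ang, bins=n_bins, range=(0.0, np.pi), weights=mag)
    norm = np.linalg.norm(hist)
    return hist / norm if norm > 0 else hist

def describe(image, grid=4, entropy_thresh=3.0):
    """Split the image into grid x grid blocks and keep descriptors
    only for high-entropy (i.e. information-rich) regions."""
    h, w = image.shape
    bh, bw = h // grid, w // grid
    regions = {}
    for i in range(grid):
        for j in range(grid):
            block = image[i * bh:(i + 1) * bh, j * bw:(j + 1) * bw]
            if block_entropy(block) >= entropy_thresh:
                regions[(i, j)] = hog_descriptor(block)
    return regions

def match_score(query, reference):
    """Mean cosine similarity over regions selected in both images."""
    shared = query.keys() & reference.keys()
    if not shared:
        return 0.0
    return float(np.mean([query[k] @ reference[k] for k in shared]))
```

Because only high-entropy regions carry descriptors, low-information areas (sky, road surface) contribute nothing to the match, which is the mechanism behind the training-free robustness the letter reports.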
Memorable Maps: A Framework for Re-defining Places in Visual Place Recognition
This paper presents a cognition-inspired, technique-agnostic framework for building a map for Visual Place Recognition. This framework draws inspiration from human memorability, utilizes the traditional image entropy concept and computes the static content in an image, thereby presenting a three-fold criterion to assess the 'memorability' of an image for visual place recognition. A dataset, namely 'ESSEX3IN1', is created, composed of highly confusing images from indoor, outdoor and natural scenes for analysis. When used in conjunction with state-of-the-art visual place recognition methods, the proposed framework provides a significant performance boost to these techniques, as evidenced by results on ESSEX3IN1 and other public datasets.
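Of the three criteria, the information-content term is the most directly computable. A minimal version of the conjoint check might look like the following, where the memorability and staticity terms are represented only as placeholder inputs (in the paper they come from learned models) and the thresholds are illustrative, not the published values:

```python
import numpy as np

def image_entropy(gray, bins=256):
    """Shannon entropy (bits) of an 8-bit grayscale image's histogram."""
    hist, _ = np.histogram(gray, bins=bins, range=(0, 256))
    p = hist / hist.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def is_memorable(gray, memorability, staticity,
                 t_mem=0.5, t_static=0.5, t_entropy=4.0):
    """Conjoint three-fold criterion: an image enters the map only if
    memorability, staticity AND information content all pass."""
    return (memorability >= t_mem and
            staticity >= t_static and
            image_entropy(gray) >= t_entropy)
```

The conjunction matters: a highly static but featureless frame (e.g. a blank wall) fails the entropy test and is excluded from the map, which is how the framework filters out the 'confusing' images that degrade downstream VPR methods.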
Levelling the Playing Field: A Comprehensive Comparison of Visual Place Recognition Approaches under Changing Conditions
In recent years, there has been significant improvement in the capability of Visual Place Recognition (VPR) methods, building on the success of both hand-crafted and learnt visual features, temporal filtering and the use of semantic scene information. The wide range of approaches and the relatively recent growth of interest in the field have meant that a wide range of datasets and assessment methodologies have been proposed, often with a focus only on precision-recall-type metrics, making comparison difficult. In this paper we present a comprehensive approach to evaluating the performance of 10 state-of-the-art, recently developed VPR techniques, which utilizes three standardized metrics: (a) matching performance, (b) matching time and (c) memory footprint. Together this analysis provides an up-to-date and widely encompassing snapshot of the various strengths and weaknesses of contemporary approaches to the VPR problem. The aim of this work is to help move this particular research field towards a more mature and unified approach to the problem, enabling better comparison and hence more progress in future research.
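Per technique, a benchmark along these lines reduces to measuring three numbers on a common dataset. A hedged sketch of such a harness is below; the `encode`/`match` interface, the recall@1 proxy for matching performance and the descriptor-size footprint are illustrative assumptions, not the paper's exact protocol:

```python
import time
import numpy as np

def evaluate(encode, match, queries, references, ground_truth):
    """Benchmark one VPR technique on three standardized axes:
    matching performance, matching time and memory footprint."""
    ref_maps = [encode(r) for r in references]

    t0 = time.perf_counter()
    scores = np.array([[match(encode(q), r) for r in ref_maps]
                       for q in queries])
    match_time = (time.perf_counter() - t0) / len(queries)

    # Matching performance: fraction of queries whose best-scoring
    # reference is the ground-truth match (a simple recall@1 proxy).
    correct = (scores.argmax(axis=1) == ground_truth).mean()

    # Memory footprint: bytes held by the stored reference descriptors.
    footprint = sum(m.nbytes for m in ref_maps)

    return {"recall@1": float(correct),
            "match_time_s": match_time,
            "memory_bytes": footprint}
```

Holding the dataset, ground truth and metrics fixed while swapping only `encode` and `match` is what makes scores from different techniques directly comparable, which is the levelling the title refers to.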
Sequence-Based Filtering for Visual Route-Based Navigation: Analysing the Benefits, Trade-offs and Design Choices
Visual Place Recognition (VPR) is the ability to correctly recall a previously visited place using visual information under environmental, viewpoint and appearance changes. An emerging trend in VPR is the use of sequence-based filtering methods on top of single-frame-based place matching techniques for route-based navigation. The combination leads to varying levels of potential place matching performance boost at increased computational cost. This raises a number of interesting research questions: How does the performance boost (due to sequential filtering) vary along the entire spectrum of single-frame-based matching methods? How does sequence matching length affect the performance curve? Which specific combinations provide a good trade-off between performance and computation? However, there is a lack of previous work looking at these important questions, and most of the sequence-based filtering work to date has been applied without a systematic approach. To bridge this research gap, this paper conducts an in-depth investigation of the relationship between the performance of single-frame-based place matching techniques and the use of sequence-based filtering on top of those methods. It analyzes individual trade-offs, properties and limitations for different combinations of single-frame-based and sequential techniques. A number of state-of-the-art VPR methods and widely used public datasets are utilized to present findings that contain a number of meaningful insights for the VPR community.
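The core idea of sequence-based filtering can be sketched as re-scoring a single-frame similarity matrix along short diagonals. The version below is a constant-velocity simplification of SeqSLAM-style matching (no velocity search), with `seq_len` standing in for the sequence matching length the paper analyses:

```python
import numpy as np

def sequence_filter(S, seq_len=5):
    """Re-score a single-frame similarity matrix S (queries x references)
    by averaging similarities along the diagonal sequence ending at each
    (query, reference) pair, assuming query and reference routes are
    traversed at similar speeds."""
    n_q, n_r = S.shape
    out = np.full_like(S, -np.inf, dtype=float)
    for i in range(seq_len - 1, n_q):
        for j in range(seq_len - 1, n_r):
            # average single-frame similarity along the diagonal
            # ending at (i, j)
            out[i, j] = np.mean([S[i - k, j - k] for k in range(seq_len)])
    return out
```

This makes the trade-off the paper quantifies concrete: a larger `seq_len` averages away single-frame mismatches, but costs `seq_len` frames of latency and O(`seq_len`) extra work per candidate pair, and the size of the gain depends on how noisy the underlying single-frame matcher is.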
Are State-of-the-art Visual Place Recognition Techniques any Good for Aerial Robotics?
Visual Place Recognition (VPR) has seen significant advances at the frontiers of matching performance and computational efficiency over the past few years. However, these evaluations are performed for ground-based mobile platforms and cannot be generalized to aerial platforms. The degree of viewpoint variation experienced by aerial robots is complex, and their processing power and on-board memory are limited by payload size and battery ratings. Therefore, in this paper, we take state-of-the-art VPR techniques that have previously been evaluated for ground-based platforms and compare them on recently proposed aerial place recognition datasets with three prime focuses: a) matching performance, b) processing power consumption and c) projected memory requirements. This gives a bird's-eye view of the applicability of contemporary VPR research to aerial robotics and lays out the nature of the challenges for aerial VPR.